Dependency manager for databases

ABSTRACT

The present disclosure relates to in-memory databases or search engines using a dependency manager or configuration manager for maintaining configuration in the database system. The system may include a supervisor that may request and receive data from a dependency manager, where the supervisor may be linked to other components in the system. The dependency manager may be used as a container for data, metadata, and software components, which may be used in the system configuration. The configuration may be developed through a dependency system, where the dependency manager may keep an entire dependency tree for all software and data in the system. Similarly, the dependency manager may create a deployable package to guarantee deployment integrity and to ensure a successful execution of any suitable software and data in the system.

CROSS-REFERENCE TO RELATED APPLICATIONS

This patent application is a continuation of U.S. Non-Provisional application Ser. No. 14/558,009, filed on Dec. 2, 2014, which claims a benefit of priority to U.S. Provisional Application 61/910,860, filed on Dec. 2, 2013, which are hereby fully incorporated by reference in their entirety.

This application is related to U.S. patent application Ser. No. 14/557,794, entitled “Method for Disambiguating Features in Unstructured Text,” filed Dec. 2, 2014; U.S. patent application Ser. No. 14/558,300, entitled “Event Detection Through Text Analysis Using Trained Event Template Models,” filed Dec. 2, 2014; U.S. patent application Ser. No. 14/557,807, entitled “Method for Facet Searching and Search Suggestions,” filed Dec. 2, 2014; U.S. patent application Ser. No. 14/558,254, entitled “Design and Implementation of Clustered In-Memory Database,” filed Dec. 2, 2014; U.S. patent application Ser. No. 14/557,827, entitled “Real-Time Distributed In Memory Search Architecture,” filed Dec. 2, 2014; U.S. patent application Ser. No. 14/557,951, entitled “Fault Tolerant Architecture for Distributed Computing Systems,” filed Dec. 2, 2014; U.S. patent application Ser. No. 14/558,055, entitled “Pluggable Architecture for Embedding Analytics in Clustered In-Memory Databases,” filed Dec. 2, 2014; U.S. patent application Ser. No. 14/558,101, entitled “Non-Exclusionary Search Within In-Memory Databases,” filed Dec. 2, 2014; and U.S. patent application Ser. No. 14/557,900, entitled “Data record compression with progressive and/or selective decompression,” filed Dec. 2, 2014; each of which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

The present disclosure relates in general to databases, and more particularly, to a dependency manager that may be used for in-memory databases.

BACKGROUND

Package management systems may be designed to save organizations time and money through remote administration and software distribution technology that may eliminate the need for manual installation and updates for any suitable component, such as software, operating system components, application programs, support libraries, application data, general documentation, and other data, from a system or process. One conventional approach in the art related to package management systems may be the Red Hat package manager (RPM). Package managers may present a uniform way to install and/or update software programs and associated components.

To install a set of software or data packages, a package manager may order the packages and their dependent packages in topological order onto a graph. Subsequently, the package manager may collect the packages at the bottom of the graph and install these packages first. Finally, the package manager may move up the graph and install the next set of packages.
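
As an illustration of the topological ordering described above, the following is a minimal Python sketch; the package names and the install_order helper are hypothetical and do not correspond to any particular package manager. The sketch assumes an acyclic dependency graph.

    def install_order(dependencies):
        """Return packages so that every dependency precedes its dependents."""
        order, visited = [], set()

        def visit(pkg):
            if pkg in visited:
                return
            visited.add(pkg)
            for dep in dependencies.get(pkg, ()):  # walk down to the bottom of the graph
                visit(dep)
            order.append(pkg)                      # emit only after all dependencies

        for pkg in dependencies:
            visit(pkg)
        return order

    # Hypothetical graph: "app" depends on "libfoo" and "libbar", both of which need "libc".
    deps = {"app": ["libfoo", "libbar"], "libfoo": ["libc"], "libbar": ["libc"], "libc": []}
    print(install_order(deps))  # ['libc', 'libfoo', 'libbar', 'app']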

However, the conventional approach in the art related to database management systems is limited in that some package managers may only keep the software configuration in the system, but may not support metadata or primary data collection dependencies. In a database, particularly an in-memory database or other distributed storage architectures, deployment focuses as much on data as on software, and therefore maintaining the dependency trees required for data deployment is essential.

Conventional technologies may automate deployment, installation, and configuration of software components and associated dependencies across a cluster of one or more computers in a conventional distributed computing architecture. What is needed is a solution to automate the deployment, installation, and configuration of data, metadata, and software of a primary datastore of a distributed database, in a distributed computing architecture, such as in-memory databases and other distributed data platforms. Moreover, because conventional solutions focus on deploying a static set of services and data, conventional systems lack the ability to detect service or data failures and then automatically recover from those failures by moving a package of data, metadata, and software to other available nodes in the distributed system.

For the aforementioned reasons, there is a need for an improved package management application that guarantees a successful execution of the system configuration and its dependencies in a data management system.

SUMMARY

Disclosed herein are systems and methods for handling dependencies during the process of installing, upgrading, and configuring different software, data, or metadata packages for any suitable database or search engine. The systems and methods may automate processes for deploying, installing, and configuring various data, metadata, and software stored in a primary datastore of the distributed-computing system, such as a distributed system hosting an in-memory database, or other types of distributed data platforms. Exemplary embodiments may describe systems and methods in which a dependency manager (configuration management) may be linked directly to a supervisor (systems management), where the supervisor may maintain the system in a fully functional manner, and may accept configuration requests to make changes in the system.

In one embodiment, a computer-implemented method comprises transmitting, by a computer of a distributed computing system, a request for a machine-readable deployable-package file associated with a target node of the system to a dependency manager node comprising a non-transitory machine-readable storage medium storing one or more deployable package files associated respectively with one or more nodes of the system according to a dependency tree; transmitting, by the computer, the deployable package file to the target node in response to receiving the deployable package file from the dependency manager node, wherein the deployable package file associated with the target node contains a set of one or more dependency files based on the dependency tree; and instructing, by the computer, the target node to install the set of dependencies in the deployable package onto the target node.

In another embodiment, a computer-implemented method comprises determining, by a computer, a set of one or more dependency files to be installed onto a target node using a dependency tree associated with the target node responsive to receiving a request to configure the target node from a supervisor node; fetching, by the computer, each of the dependency files of the set of one or more dependency files from at least one data frame comprising a non-transitory machine-readable storage medium storing one or more dependency files; generating, by the computer, a deployable package file comprising the set of one or more dependency files; and transmitting, by the computer, the deployable package file to the supervisor node.

In another embodiment, a database management system comprises one or more nodes comprising a non-transitory machine-readable storage memory storing one or more dependency files, and a processor monitoring a status of the one or more dependency files, wherein each respective dependency file is a component of the node having a comparative relationship with a corresponding component installed on a second node; one or more supervisor nodes comprising a processor monitoring a status for each of the one or more nodes and configured to transmit a deployable package comprising a set of dependency files to each of the nodes based on the status of each respective node; and one or more dependency manager nodes comprising a non-transitory machine-readable storage medium storing one or more dependency tree files associated with the one or more nodes, and a processor configured to compile a deployable package file in accordance with a dependency tree associated with a node, wherein the deployable package file comprises a set of one or more dependency files stored on at least one data frame, and wherein the dependency manager node determines a dependency to include in the deployable package based on a dependency tree associated with a node targeted to receive the deployable package.

Numerous other aspects and features of the present disclosure may be made apparent from the following detailed description. Additional features and advantages of an embodiment will be set forth in the description which follows, and in part will be apparent from the description. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the exemplary embodiments in the written description and claims hereof as well as the appended drawings.

BRIEF DESCRIPTION OF DRAWINGS

The present disclosure can be better understood by referring to the following figures. The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the disclosure. In the figures, reference numerals designate corresponding parts throughout the different views.

FIG. 1 illustrates a block diagram connection of supervisor and dependency manager, according to an embodiment.

FIG. 2 is a flowchart diagram of a configuration process, according to an embodiment.

FIG. 3 illustrates a block diagram of dependencies used for the configuration of a system, according to an embodiment.

FIG. 4 is a flowchart showing fault handling by a distributed computing system, according to an exemplary method embodiment.

DEFINITIONS

As used here, the following terms may have the following definitions:

“Dependency Tree” refers to a type of data structure, which may show the relationship of partitions, modules, files, or data, among others.

“Deployable Package” refers to a set of information, which may be used in the configuration of modules, partitions, files, or data, among others.

“Node” refers to a computer hardware configuration suitable for running one or more modules.

“Cluster” refers to a set of one or more nodes.

“Module” refers to a computer software component suitable for carrying out one or more defined tasks.

“Partition” refers to an arbitrarily delimited portion of records of a collection.

“Collection” refers to a discrete set of records.

“Record” refers to one or more pieces of information that may be handled as a unit.

“Node Manager” refers to a module configured to at least perform one or more commands on a node and communicate with one or more supervisors.

“Heartbeat” refers to a signal communicating at least one or more statuses to one or more supervisors.

“Supervisor” refers to a configuration/monitoring module that may create and execute plans for change in response to changes in one or more statuses or to external requests for change.

“Database” refers to any system including any combination of clusters and modules suitable for storing one or more collections and suitable to process one or more queries.

“Dependency Manager” refers to a module configured to at least include one or more dependency trees associated with one or more modules, partitions, or suitable combinations, in a system; to at least receive a request for information relating to any one or more suitable portions of said one or more dependency trees; and to at least return one or more configurations derived from said portions.

DETAILED DESCRIPTION

The present disclosure is here described in detail with reference to embodiments illustrated in the drawings, which form a part hereof. Other embodiments may be used and/or other changes may be made without departing from the spirit or scope of the present disclosure. The illustrative embodiments described in the detailed description are not meant to be limiting of the subject matter presented here.

Conventional solutions focus on deploying a fairly static set of services, and so conventional solutions typically lack the functionality required to detect failures of system components and then automatically recover by moving a package of data, metadata, and/or software, to other available nodes in the distributed system.

According to one embodiment, a dependency manager may be used as a container for the maintenance or configuration of any suitable software or data component in the system. Those configurations may be driven by new data, metadata or software updates in a release process.

In another embodiment, the dependency manager may include a dependency tree for releasing a releasable file, such as releases of data, metadata, or software, or any other component of the system, to the system. The releasable file may require a configuration for dependencies that may be directly linked or wrapped around another component that is being configured, and so additional components or configuration may be required. Similarly, the dependency manager may keep a system-level dependency tree for all of the software and data components released into the system.

In a further embodiment, if any suitable software or data component is released in a dependency tree, the dependency manager may create a deployable package to guarantee deployment integrity. That is, the deployment integrity may ensure a successful execution of any suitable software or data component, providing a desired result.

Reference will now be made to the exemplary embodiments illustrated in the drawings, and specific language will be used here to describe the same. It will nevertheless be understood that no limitation of the scope of the invention is thereby intended. Alterations and further modifications of the inventive features illustrated here, and additional applications of the principles of the inventions as illustrated here, which would occur to one skilled in the relevant art and having possession of this disclosure, are to be considered within the scope of the invention.

FIG. 1 illustrates a block diagram connection 100 of supervisor 102 and dependency manager 104. Generally, supervisor 102 may monitor the system and/or execute processes and tasks that maintain an operating state for the system. Supervisor 102 may accept any suitable configuration requests to make changes in the system. Software or data configurations may be handled by nodes executing a dependency manager 104 software module or a supervisor 102 software module; however, the deployable package may be provided from a separate data frame. The separate data frame is a non-transitory machine-readable storage medium storing one or more releasable files used in preparing a deployable package according to a configuration.

According to one embodiment, the dependency manager 104 may be used as a non-transitory machine-readable storage medium containing the maintenance or configuration of any suitable software or data component in the system. Those configurations may be driven by new data, metadata or software updates in a release process.

The dependency manager 104 may play a role in configurations required by some processes in the system. That is, dependency manager 104 may be directly connected with supervisor 102 in order to provide the suitable dependencies, otherwise referred to as “packages,” “configurations,” “components,” and/or “files,” for the partitions, which may be used to update any suitable collection. Furthermore, supervisor 102 may be linked to one or more dependency managers 104 and may additionally be linked to one or more other supervisors 102, where additional supervisors 102 may be linked to other components in the system.

FIG. 2 is a flowchart diagram 200 of a configuration process in the system.

According to another embodiment, the configuration process or maintenance process may include the information regarding which dependencies a module has and which need to be deployed along with the module. The required files may be fetched from a separate non-transitory machine-readable storage, or “data frame.” In some embodiments, this data frame may be external to the system architecture; for example, in the case of a third-party vendor providing software updates. The dependencies in a suitable deployable package may include different types of files, data, or software that are directly linked or wrapped around the module or the partition that is being configured. The configuration process may include steps 202, 204, 206, 208, 210, and 212. The configuration process 200 may begin in response to requests requiring the system to install or update data or software components.

In a first step 202, processors of the system may automatically detect a situation that may trigger the configuration process 200 sequence of steps.

In some embodiments, in step 202, a node of the system executing a supervisor module may poll components of the system, such as node manager software modules, responsible for reporting a health update, or “status,” to the supervisor. In such embodiments, the supervisor may automatically detect failures throughout the system based on a lack of a heartbeat (HB) signal the supervisor expects to receive from any system module, as defined by the system configuration. The supervisor may then trigger configuration process 200, among other remedial processes, in response to detecting the missing HB signal.
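
As a minimal sketch of the heartbeat-based detection described in step 202, the following Python illustrates a supervisor-side monitor that records the last heartbeat time per module and reports modules whose heartbeats are overdue; the module name, timeout value, and class name are hypothetical assumptions rather than part of the disclosed system.

    import time

    HEARTBEAT_TIMEOUT = 10.0  # seconds without a heartbeat before a module is considered failed

    class HeartbeatMonitor:
        def __init__(self):
            self.last_seen = {}  # module name -> timestamp of the most recent heartbeat

        def record_heartbeat(self, module_name):
            self.last_seen[module_name] = time.time()

        def missing_modules(self):
            """Return modules whose heartbeat is overdue, which would trigger process 200."""
            now = time.time()
            return [m for m, t in self.last_seen.items() if now - t > HEARTBEAT_TIMEOUT]

    monitor = HeartbeatMonitor()
    monitor.record_heartbeat("partition-1-search")
    print(monitor.missing_modules())  # empty until a module stops reporting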

In some embodiments, in step 202, a node of the system executing a supervisor module may trigger configuration process 200 when the supervisor receives an external request for one or more changes in the system configuration, such as updates to a component or migration to new node hardware.

In step 204, the supervisor may send a request to the dependency manager to retrieve one or more deployment packages associated with one or more modules that are to be installed on the node. A deployment package defines each of the files and/or other materials required to satisfy the node configuration according to the dependency manager. The deployable package may contain all required dependencies, including source and destination information necessary for proper deployment, and may contain module properties needed to configure or start the module. A particular dependency may have its own dependencies, also defined in the dependency manager, and therefore may be referred to as a dependency tree.
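
A minimal sketch of what the deployment package described in step 204 might carry follows, assuming a simple Python dictionary layout; the field names, paths, and module properties are illustrative assumptions rather than a prescribed format.

    # Hypothetical deployable package for a module, holding the resolved dependencies
    # plus the source/destination information and module properties described above.
    deployable_package = {
        "target_module": "partition-1",
        "dependencies": [
            {"name": "phonetic-1.0",
             "source": "dataframe://releases/phonetic-1.0",
             "destination": "/opt/imdb/modules/phonetic-1.0"},
            {"name": "compression-1.0",
             "source": "dataframe://releases/compression-1.0",
             "destination": "/opt/imdb/modules/compression-1.0"},
        ],
        "module_properties": {"heap_mb": 2048, "start_on_deploy": True},
    }
    print(len(deployable_package["dependencies"]), "dependencies in package")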

In step 206, the supervisor may transmit instructions to the dependency manager to fetch the required deployment packages from a data frame storing the deployment package. The data frame may be any non-transitory machine-readable storage media, which may be located on any suitable computing device communicatively coupled to a node executing the dependency manager. In some cases, when a deployment package is generated, the deployment package contains all dependencies for the module being transmitted, as well as the source and destination information needed to properly deploy the deployment package. The deployment package may also include one or more module properties needed to configure or start the deployment package. Deployment packages may be generated through automated or manual processes. In a manual example, a system administrator may identify and/or create a deployment package with the requisite files and data. In an automated example, the supervisor or dependency manager may automatically identify and/or generate the deployment package using the automatically identified files, which is usually accomplished through a test script generated by the dependency manager, thereby yielding installation speeds and distribution rates higher than could be achieved by a human.
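
The following Python sketch illustrates step 206 under stated assumptions: the data frame is stood in for by an in-memory dictionary of releasable files, and build_deployable_package is a hypothetical helper that bundles the fetched files into a single archive.

    import io
    import tarfile

    # Stand-in for the data frame: a store of releasable files keyed by dependency name.
    data_frame = {
        "phonetic-1.0": b"<phonetic module bytes>",
        "compression-1.0": b"<compression module bytes>",
    }

    def build_deployable_package(required_files, store):
        """Fetch each required file from the store and bundle them into one archive."""
        buffer = io.BytesIO()
        with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
            for name in required_files:
                payload = store[name]                       # fetch from the data frame
                info = tarfile.TarInfo(name=name)
                info.size = len(payload)
                archive.addfile(info, io.BytesIO(payload))  # add to the deployable package
        return buffer.getvalue()

    package = build_deployable_package(["phonetic-1.0", "compression-1.0"], data_frame)
    print(len(package), "bytes in deployable package")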

In step 208, after the dependency manager receives the deployment packages from the data frame, the dependency manager may transmit the deployable package to the node executing the supervisor that requested the deployment packages.

In step 210, the supervisor may send the deployable package to the node manager of the node requiring the configuration.

In step 212, the node manager may copy files, install, and/or execute the deployable package received from the supervisor, thereby implementing the requisite maintenance, update, or configuration for the system.

FIG. 3 illustrates a block diagram of dependencies 300 used for the configuration of a system. According to a further embodiment, the process for the maintenance or configuration of a system may include different components, such as dependency manager 302, supervisor 304, search node 306, node manager 308, and dependency tree 310, among others.

A dependency tree 310 may include different types of files that may be directly linked or wrapped around a module or partition, such that a dependency may be the degree to which each member of a partition relies on each one of the other members in the partition. For instance, dependency tree 310 may include partition 1, which may depend on phonetic 1.0 and compression 1.0; subsequently, phonetic 1.0 may depend on software libraries (such as processing DLL 1.0 and input DLL 1.0), and compression 1.0 may depend on data-table 1.0, and so on.
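
A minimal sketch of the dependency tree 310 example above follows, represented as a Python mapping from each component to its direct dependencies; the traversal collects the transitive set of files a deployable package for partition 1 would need. The data structure itself is an illustrative assumption.

    dependency_tree = {
        "partition-1":     ["phonetic-1.0", "compression-1.0"],
        "phonetic-1.0":    ["processing-dll-1.0", "input-dll-1.0"],
        "compression-1.0": ["data-table-1.0"],
    }

    def transitive_dependencies(tree, component):
        """Collect every dependency reachable from the given component."""
        deps = []
        for child in tree.get(component, ()):
            deps.append(child)
            deps.extend(transitive_dependencies(tree, child))
        return deps

    print(transitive_dependencies(dependency_tree, "partition-1"))
    # ['phonetic-1.0', 'processing-dll-1.0', 'input-dll-1.0', 'compression-1.0', 'data-table-1.0']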

The dependency manager 302 may store a dependency tree 310 associated with any releasable file of the system. In a further embodiment, if any suitable software or data component is released to components indicated within the dependency tree 310, the dependency manager 302 may create a deployable package from one or more files stored on a data frame.

Supervisor 304 may be linked to one or more dependency managers 302 including one or more dependency trees 310 for one or more modules, partitions, or suitable combinations thereof. Supervisor 304 may additionally be linked to one or more other supervisors 304, where additional supervisors 304 may be linked to other components in the system.

The foregoing method descriptions and the process flow diagrams are provided merely as illustrative examples and are not intended to require or imply that the steps of the various embodiments must be performed in the order presented. As will be appreciated by one of skill in the art, the steps in the foregoing embodiments may be performed in any order. Words such as “then,” “next,” etc. are not intended to limit the order of the steps; these words are simply used to guide the reader through the description of the methods. Although process flow diagrams may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination may correspond to a return of the function to the calling function or the main function.

The various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

Embodiments implemented in computer software may be implemented in software, firmware, middleware, microcode, hardware description languages, or any combination thereof. A code segment or machine-executable instructions may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted via any suitable means including memory sharing, message passing, token passing, network transmission, etc.

The actual software code or specialized control hardware used to implement these systems and methods is not limiting of the invention. Thus, the operation and behavior of the systems and methods were described without reference to the specific software code, it being understood that software and control hardware can be designed to implement the systems and methods based on the description herein.

When implemented in software, the functions may be stored as one or more instructions or code on a non-transitory computer-readable or processor-readable storage medium. The steps of a method or algorithm disclosed herein may be embodied in a processor-executable software module which may reside on a computer-readable or processor-readable storage medium. A non-transitory computer-readable or processor-readable media includes both computer storage media and tangible storage media that facilitate transfer of a computer program from one place to another. A non-transitory processor-readable storage media may be any available media that may be accessed by a computer. By way of example, and not limitation, such non-transitory processor-readable media may comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other tangible storage medium that may be used to store desired program code in the form of instructions or data structures and that may be accessed by a computer or processor. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and blu-ray disc where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media. Additionally, the operations of a method or algorithm may reside as one or any combination or set of codes and/or instructions on a non-transitory processor-readable medium and/or computer-readable medium, which may be incorporated into a computer program product.

The preceding description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the following claims and the principles and novel features disclosed herein.

While various aspects and embodiments have been disclosed, other aspects and embodiments are contemplated. The various aspects and embodiments disclosed are for purposes of illustration and are not intended to be limiting, with the true scope and spirit being indicated by the following claims.

FIG. 4 is a flowchart for fault handling 400.

The supervisor maintains the definition and configuration of all data collections in the system, which may include settings per collection that indicate how many redundant copies of each partition are desired, how many times to try to restart failed components before moving them to another node, etc. The supervisor also maintains a list of available nodes and their resources, as provided by the node managers. From that information, the supervisor computes a desired system state by mapping the needed system modules to available nodes, while still complying with configuration settings. Fault handling 400 begins with the supervisor detecting a module failure 402, where one or more supervisors may detect failures of one or more modules by comparing the actual system state to a desired system state. In one or more embodiments, supervisors may detect failure when one or more heartbeats from node managers or system modules are no longer detected. In one or more other embodiments, heartbeats from one or more modules may include status information about one or more other modules that may be interpreted by the one or more supervisors.
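
As a minimal sketch of how a desired system state could be derived, the following Python maps each replica of each partition onto available nodes while honoring a per-collection redundancy setting; the collection names, node list, and round-robin placement rule are hypothetical simplifications, and the sketch assumes the replica count does not exceed the node count.

    from itertools import cycle

    def desired_state(collections, nodes):
        """Return {node: [module, ...]} placing each partition replica on a distinct node."""
        assignment = {node: [] for node in nodes}
        node_cycle = cycle(nodes)
        for collection, settings in collections.items():
            for partition in settings["partitions"]:
                placed_on = set()
                while len(placed_on) < settings["replicas"]:
                    node = next(node_cycle)
                    if node in placed_on:
                        continue                    # keep redundant copies on distinct nodes
                    assignment[node].append(f"{collection}/{partition}")
                    placed_on.add(node)
        return assignment

    collections = {"news": {"partitions": ["p0", "p1"], "replicas": 2}}
    print(desired_state(collections, ["node-a", "node-b", "node-c"]))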

A supervisor may store definitions of data collections and the configuration settings associated with the data collections. The supervisor may also store information about available system resources, as reported by node managers. The configuration information may include settings per collection that indicate how many redundant copies of each respective partition are desired, how many times to try to restart failed components before moving them to another node, among others. From all this information, the supervisor derives a ‘desired’ system state that maps the needed system modules to available nodes, while still complying with configuration settings. All this information is represented as JSON objects, which may be stored as JSON files on disk, or in a predefined data collection within the IMDB.
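
A minimal sketch of the kind of JSON object described above follows, written from Python; the per-collection field names, values, and file path are illustrative assumptions only.

    import json

    collection_config = {
        "collection": "news",
        "replicas_per_partition": 2,
        "restart_attempts_before_move": 3,
        "partitions": ["p0", "p1"],
    }

    # Persist the settings as a JSON file on disk; a predefined collection in the
    # IMDB could hold the same object instead.
    with open("news-collection.json", "w") as fh:
        json.dump(collection_config, fh, indent=2)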

The supervisor may then detect if the associated node manager is functioning 404.

If the node manager associated with the one or more failed modules is functioning as desired or according to a status quo configuration, then the supervisor may send one or more commands to the node manager instructing the node manager to attempt to restart the one or more failed modules, in a step 406.

The supervisor may then check if module is restored 408, and if so the process may proceed to end 410. In some implementations, the first action of any module is to report a status via heartbeats to one or more available supervisors. If it is determined that module function is not restored, as indicated by heartbeats, the supervisor may determine if the restart threshold has been reached 412. The threshold number of attempts is a configuration setting per collection, which may be set by the system administrator and stored with the supervisor. The supervisor determines that a module has failed and should be restarted or moved to another node. The supervisor sends commands to the node manager. If the number of attempts has not been reached, the node manager attempts to restart module 406.

If the threshold has been reached, the supervisor determines the next suitable node to place the module 414 and the supervisor requests the node manager on the new node to stage all module dependencies and start the current module 416.

The supervisor may then check if module is restored 418, and if so the process may proceed to end 410. If the module is not restored, the system may check if the restart threshold for the new node has been reached 420. If the threshold has not been reached, the supervisor requests the node manager on the new node to stage and start the current module 416.

Otherwise, the supervisor may check if the global node retry threshold has been reached 422. This value is also defined by a system administrator and may be stored with the supervisor in a script, or as a JSON or similar data structure object. If the threshold has not been reached, the supervisor determines the next suitable node to place the module 414 and attempts to restart the module on the new node. If the global threshold has been reached, the system may then raise an alarm indicating module failure 424.
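
The retry escalation walked through above can be summarized with the following Python sketch: restart in place up to a per-node threshold, then move the module to the next suitable node, and raise an alarm once the global node-retry threshold is exhausted. The threshold values and the try_start callback are hypothetical placeholders.

    RESTART_THRESHOLD = 3    # restarts attempted on one node before moving on
    GLOBAL_NODE_RETRIES = 2  # nodes tried before raising an alarm

    def recover_module(module, candidate_nodes, try_start):
        """try_start(module, node) returns True once the module reports heartbeats again."""
        for node_attempt, node in enumerate(candidate_nodes):
            if node_attempt >= GLOBAL_NODE_RETRIES:
                break
            for _ in range(RESTART_THRESHOLD):
                if try_start(module, node):
                    return f"{module} restored on {node}"
        return f"ALARM: {module} could not be restored"

    # Hypothetical usage: starting always fails on node-a and succeeds on node-b.
    print(recover_module("partition-1", ["node-a", "node-b"], lambda m, n: n == "node-b"))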

If the supervisor detects that the associated node manager is not functioning based on the corresponding heartbeats, as indicated by a lack of heartbeats or heartbeats from the node manager indicating a failed state, the supervisor selects a module associated with the node with a failed node manager 426. Then, the supervisor determines the next suitable node to place the module 428. Afterwards, the supervisor requests the node manager on the new node to stage and start the current module 430.

The supervisor may then check if module is restored 432. If the module is not restored, the supervisor checks if the restart threshold for the new node has been reached 434. If the threshold has not been reached, the supervisor requests the node manager on the new node to stage and start the current module 430.

If the threshold has been reached, the supervisor then checks if the global node retry threshold has been reached 436. If the threshold has not been reached, the supervisor determines the next suitable node to place the module 428 and attempts to restart the module on the new node. If the global threshold has been reached, the system may then raise an alarm indicating module failure 438.

Otherwise, if the module is restored, the supervisor then checks if there are more modules to be migrated off the failed node 440. If a node has failed, the supervisor is configured to migrate all of the services that had been running on the failed node 440, as defined in the desired state. The supervisor will calculate a new desired state without the failed node 440 and will need to migrate services accordingly. In some implementations, the supervisor may select a module associated with the node having a failed node manager 426 and the node manager attempts to stage and start the module.

If the supervisor determines no more modules are to be migrated, the process may end 410.

In one or more embodiments, a node may fail and a supervisor may determine, based on information from node manager heartbeats, that no nodes have available resources. In some implementations, the node managers report their available resources in each corresponding heartbeat. The supervisor may then attempt to make resources available in other nodes in the system while maintaining a desired redundancy. In one or more embodiments, resources may be made available by unloading a module or partition. The supervisor may then load the desired module or partition on the available resources.
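
A minimal sketch of the resource-freeing step described above follows: if no node reports enough free capacity for a module, the supervisor unloads another module or partition to make room. The resource figures, the evictable flag, and the unload callback are hypothetical.

    def find_or_free_node(nodes, required_mb, unload):
        """Return a node with enough free memory, unloading evictable modules if needed."""
        for node in nodes:
            if node["free_mb"] >= required_mb:
                return node["name"]
        # No node has room: unload evictable modules until one node has enough capacity.
        for node in nodes:
            for module in list(node["modules"]):
                if module["evictable"]:
                    unload(node["name"], module["name"])   # supervisor-issued unload request
                    node["free_mb"] += module["size_mb"]
                    if node["free_mb"] >= required_mb:
                        return node["name"]
        return None

    nodes = [{"name": "node-a", "free_mb": 512,
              "modules": [{"name": "analytics-cache", "size_mb": 2048, "evictable": True}]}]
    print(find_or_free_node(nodes, 1024, lambda n, m: print(f"unloading {m} from {n}")))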

Example #1 illustrates what happens if a single module fails due to some resource no longer available on the node but the node itself is not otherwise adversely affected.

In this case, when the module fails the heartbeat connections to the supervisor are dropped, thereby alerting the supervisor to the module failure. The supervisor will attempt to reconnect to the module to check if the failure was just a connection issue or a module failure. In some embodiments, failure to reconnect is assumed to be a module failure.

The supervisor will first request the associated node manager to restart the module in place. Starting the module in place does not incur the cost of re-staging the module and any corresponding software or data, so can be accomplished more quickly than staging and starting on another node. However, in this example the problem is due to some resource unavailability on the specified node, thus the restart will also fail.

After making a predetermined number of attempts to restart the module in place, the supervisor will look for another suitable node to start the module on. The supervisor will contact a dependency manager to acquire the correct package required to deploy the failed module. The supervisor will then pass that package on to the node manager for the newly selected node to stage and run the module. The module finds the required resources on the new node and creates a heartbeat connection to the supervisor indicating it is running properly. The supervisor marks the functionality as restored and the event is over.

Example #2 illustrates a total node fail such as a failed power supply. In this case the node manager and all modules on the server drop their heartbeat connections to the supervisor. The supervisor recognizes this as a complete node failure and marks that node as failed and unavailable. The supervisor then walks through the list of modules that were allocated to that node. For each module in that list the supervisor will look for another suitable node to start the module on. The supervisor will contact a dependency manager to acquire the correct package required to deploy the current module. The supervisor will then pass that package on to the node manager for the newly selected node to stage and run the module. The module executes and creates a heartbeat connection to the supervisor indicating it is running properly. The supervisor marks the functionality as restored for that module. This continues until all modules have been reallocated to new nodes and the event is over.

What is claimed is:
 1. A method comprising: sending, by a first node, a first instruction to a second node, wherein the first instruction instructs the second node to retrieve a package file locally and then to send the package file to the first node upon retrieval, wherein the package file is associated with a third node based on a dependency tree before the second node receives the first instruction, wherein the package file contains a set of dependency files based on the dependency tree, wherein the first node, the second node, and the third node define a cluster hosting an in-memory database; in response to receiving the package file from the second node based on the first instruction, sending, by the first node, the package file and a second instruction to the third node, wherein the second instruction instructs the third node to install the set of dependency files from the package file locally.
 2. The method of claim 1, wherein the first node sends the first instruction to the second node in response to the first node detecting a dependency failure on the third node based on the first node interpreting a heartbeat signal received from the third node.
 3. The method of claim 2, further comprising: determining, by the first node, if the third node installed the set of dependency files from the package file successfully.
 4. The method of claim 1, wherein the dependency tree is based on at least one of a partition, a module, a file, or a record, wherein the in-memory database comprises at least one of the partition, the module, the file, or the record.
 5. The method of claim 1, wherein the second node fetches a dependency from a data frame and then compiles the package file.
 6. The method of claim 1, wherein each of the dependency files is determined by the second node via the dependency tree associated with the third node.
 7. The method of claim 6, wherein each of the dependency files is based on a comparative relationship of that dependency file with a corresponding dependency file installed on a subset of nodes in the cluster.
 8. The method of claim 1, further comprising: receiving, by the first node, a third instruction from a server external to the cluster, wherein the server is associated with at least one of the dependency files installed on the third node based on the second instruction, wherein the third instruction instructs the third node to perform an update to the package file on the third node such that the at least one of the dependency files is updated via the server.
 9. A system comprising: a first node; a second node; and a third node, wherein the first node, the second node, and the third node define a cluster hosting an in-memory database, wherein the first node is programmed to: send a first instruction to the second node, wherein the first instruction instructs the second node to retrieve a package file locally and then to send the package file to the first node upon retrieval, wherein the package file is associated with the third node based on a dependency tree before the second node receives the first instruction, wherein the package file contains a set of dependency files based on the dependency tree; send the package file and a second instruction to the third node in response to receiving the package file from the second node based on the first instruction, wherein the second instruction instructs the third node to install the set of dependency files from the package file locally.
 10. The system of claim 9, wherein the first node is programmed to send the first instruction to the second node in response to the first node detecting a dependency failure on the third node based on the first node interpreting a heartbeat signal received from the third node.
 11. The system of claim 10, wherein the first node is programmed to: determine if the third node locally installed the set of dependency files from the package file successfully.
 12. The system of claim 9, wherein the dependency tree is based on at least one of a partition, a module, a file, or a record, wherein the in-memory database comprises at least one of the partition, the module, the file, or the record.
 13. The system of claim 9, wherein the second node fetches a dependency from a data frame and then compiles the package file.
 14. The system of claim 9, wherein each of the dependency files is determined by the second node via the dependency tree associated with the third node.
 15. The system of claim 14, wherein each of the dependency files is based on a comparative relationship of that dependency file with a corresponding dependency file installed on a subset of nodes in the cluster.
 16. The system of claim 15, wherein the first node is programmed to: receive a third instruction from a server external to the cluster, wherein the server is associated with at least one of the dependency files installed on the third node based on the second instruction, wherein the third instruction instructs the third node to perform an update to the package file on the third node such that the at least one of the dependency files is updated via the server.